Neural Information Processing Systems

The class distribution of the smaller datasets matches the class distribution of the complete dataset. We performed a preliminary ablation analysis with one of the datasets, the NIH Chest X-ray dataset, to understand to which blocks of ResNet-50 we should apply the intermediate loss. The preliminary ablation study gave evidence that applying the intermediate loss to all blocks yielded superior results.
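One common way to build smaller datasets whose class distribution matches the full dataset is stratified subsampling. The sketch below is only an illustration under that assumption; the function name `stratified_subset`, the toy labels, and the 10% fraction are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Return sorted indices of a subset with roughly the same class proportions as `labels`."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for indices in by_class.values():
        # Keep at least one sample per class so no class disappears.
        k = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy labels (hypothetical): 80% "normal", 20% "pneumonia".
labels = ["normal"] * 80 + ["pneumonia"] * 20
subset = stratified_subset(labels, 0.1)
subset_counts = Counter(labels[i] for i in subset)
```

With these toy labels the subset keeps the 80/20 split of the full list. Real pipelines often use a library routine for this step instead of hand-written sampling.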



fcbc95ccdd551da181207c0c1400c655-Supplemental.pdf

Neural Information Processing Systems

A When Do Bigger Models Help More? Figure A.1 shows relative improvement by increasing the model size under different amounts of ... It is also worth noting that these results may reflect a "ceiling effect": as the performance gets closer ... Figure A.1: Relative improvement (top-1) when model size is increased. Figure B.1 shows the top-1 accuracy of fine-tuned SimCLRv2 models of different sizes. For fine-tuning on 1% of labels, SK is much more efficient. Figure C.1 shows the correlation under two different fine-tuning strategies: ... We observe that overall there is a linear correlation. Furthermore, as the label fraction increases, the slope decreases.
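The excerpt does not define "relative improvement". A minimal sketch, assuming it means the accuracy gain divided by the smaller model's accuracy, is given below. The function name and the accuracy values are made up for illustration and are not results from the paper.

```python
def relative_improvement(acc_small, acc_large):
    """Fractional top-1 gain of the larger model over the smaller one (assumed definition)."""
    if acc_small <= 0:
        raise ValueError("acc_small must be positive")
    return (acc_large - acc_small) / acc_small


# Illustrative, made-up top-1 accuracies in percent; not taken from the paper.
gain_low_label = relative_improvement(50.0, 60.0)   # few labels, low baseline
gain_high_label = relative_improvement(75.0, 78.0)  # many labels, high baseline
print(f"{gain_low_label:.1%} vs {gain_high_label:.1%}")
```

Under this definition, the same absolute gain counts for less as the baseline accuracy rises, which is one way a ceiling effect can show up in such a plot.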




Shifting to Machine Supervision: Annotation-Efficient Semi and Self-Supervised Learning for Automatic Medical Image Segmentation and Classification

arXiv.org Artificial Intelligence

Advancements in clinical treatment are increasingly constrained by the limitations of supervised learning techniques, which depend heavily on large volumes of annotated data. The annotation process is not only costly but also demands substantial time from clinical specialists. Addressing this issue, we introduce the S4MI (Self-Supervision and Semi-Supervision for Medical Imaging) pipeline, a novel approach that leverages the advancements in self-supervised and semi-supervised learning. These techniques engage in auxiliary tasks that do not require labeling, thus simplifying the scaling of machine supervision compared to fully-supervised methods. Our study benchmarks these techniques on three distinct medical imaging datasets to evaluate their effectiveness in classification and segmentation tasks. Remarkably, we observed that self-supervised learning with only 10% of the annotation surpassed the performance of full annotation in the classification of most datasets. Similarly, the semi-supervised approach demonstrated superior outcomes in segmentation, outperforming fully-supervised methods with 50% fewer labels across all datasets. In line with our commitment to contributing to the scientific community, we have made the S4MI code openly accessible, allowing for broader application and further development of these methods.


MoCo-Transfer: Investigating out-of-distribution contrastive learning for limited-data domains

arXiv.org Artificial Intelligence

Medical imaging data is often siloed within hospitals, limiting the amount of data available for specialized model development. With limited in-domain data, one might hope to leverage larger datasets from related domains. In this paper, we analyze the benefit of transferring self-supervised contrastive representations from momentum contrast (MoCo) pretraining on out-of-distribution data to settings with limited data. We consider two X-ray datasets which image different parts of the body, and compare transferring from each other to transferring from ImageNet. We find that, depending on the quantity of labeled and unlabeled data, contrastive pretraining on larger out-of-distribution datasets can perform nearly as well as or better than MoCo pretraining in-domain, and pretraining on related domains leads to higher performance than using the ImageNet pretrained weights. Finally, we provide a preliminary way of quantifying similarity between datasets.
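For readers unfamiliar with MoCo, its defining step is a momentum (exponential moving average) update of the key encoder from the query encoder. The sketch below illustrates that update on plain Python lists. It is not the code used in this paper, and the name `momentum_update` and the toy parameter values are hypothetical.

```python
def momentum_update(key_params, query_params, m=0.999):
    """Return key-encoder parameters after one exponential-moving-average step.

    Each key parameter moves a fraction (1 - m) of the way toward the
    corresponding query parameter: k <- m * k + (1 - m) * q.
    """
    if len(key_params) != len(query_params):
        raise ValueError("parameter lists must have the same length")
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


# Toy parameters (hypothetical); a small m is used so the change is visible.
key = [1.0, -2.0]
query = [0.0, 2.0]
updated = momentum_update(key, query, m=0.9)
```

A momentum close to 1 keeps the key encoder changing slowly, so that keys produced at different training steps stay comparable.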


CASS: Cross Architectural Self-Supervision for Medical Image Analysis

arXiv.org Artificial Intelligence

Recent advances in deep learning and computer vision have reduced many barriers to automated medical image analysis, allowing algorithms to process label-free images and improve performance. However, existing techniques have extreme computational requirements and lose considerable performance when the batch size or the number of training epochs is reduced. This paper presents Cross Architectural - Self Supervision (CASS), a novel self-supervised learning approach that leverages a Transformer and a CNN simultaneously. Compared to existing state-of-the-art self-supervised learning approaches, we empirically show that CASS-trained CNNs and Transformers across four diverse datasets gained an average of 3.8% with 1% labeled data, 5.9% with 10% labeled data, and 10.13% with 100% labeled data while taking 69% less time. We also show that CASS is much more robust to changes in batch size and training epochs. Notably, one of the test datasets comprised histopathology slides of an autoimmune disease, a condition with minimal data that has been underrepresented in medical imaging. The code is open source and is available on GitHub.


Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis

arXiv.org Artificial Intelligence

The role of chest X-ray (CXR) imaging has evolved during the COVID-19 pandemic because it is more cost-effective, more widely available, and faster to acquire than CT. To improve the diagnostic performance of CXR imaging, a growing number of studies have investigated whether supervised deep learning methods can provide additional support. However, supervised methods rely on a large number of labeled radiology images, and labeling is a time-consuming and complex procedure requiring expert clinician input. Due to the relative scarcity of COVID-19 patient data and the costly labeling process, self-supervised learning methods have gained momentum and have been reported to achieve results comparable to fully supervised learning approaches. In this work, we study the effectiveness of self-supervised learning in the context of diagnosing COVID-19 disease from CXR images. We propose a multi-feature Vision Transformer (ViT) guided architecture where we deploy a cross-attention mechanism to learn information from both original CXR images and corresponding enhanced local phase CXR images. We demonstrate that the performance of the baseline self-supervised learning models can be further improved by leveraging the local phase-based enhanced CXR images. By using 10% labeled CXR scans, the proposed model achieves 91.10% and 96.21% overall accuracy tested on a total of 35,483 CXR images of healthy (8,851), regular pneumonia (6,045), and COVID-19 (18,159) scans and shows significant improvement over state-of-the-art techniques. Code is available at https://github.com/endiqq/Multi-Feature-ViT